PatternHunter: faster and more sensitive homology search
Abstract
MOTIVATION: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation.
RESULTS: We present a new homology search algorithm 'PatternHunter' that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop.
AVAILABILITY: PatternHunter is available at http://www.bioinformaticssolutions.com, as a commercial package. It runs on all platforms that support Java. PatternHunter technology is being patented; commercial use requires a license from BSI, while non-commercial use will be free.
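The seed-and-extend strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not PatternHunter's implementation: the function names, the toy masks, and the +1/-1 scoring with an X-drop cutoff are all assumptions made for illustration. A mask is a string of `1` (position must match) and `0` (position is ignored), which covers both a consecutive seed such as `1111` and the nonconsecutive seeds described in the related abstracts.

```python
from collections import defaultdict


def seed_key(seq, start, mask):
    """Letters of seq under the '1' positions of mask, with the mask placed at start."""
    return "".join(seq[start + i] for i, m in enumerate(mask) if m == "1")


def find_seed_hits(query, target, mask):
    """Return (query_pos, target_pos) pairs whose masked words agree."""
    span = len(mask)
    # Index every window of the target by its masked word.
    index = defaultdict(list)
    for t in range(len(target) - span + 1):
        index[seed_key(target, t, mask)].append(t)
    # Look up every window of the query in that index.
    hits = []
    for q in range(len(query) - span + 1):
        for t in index.get(seed_key(query, q, mask), []):
            hits.append((q, t))
    return hits


def extend_right(query, target, q, t, x_drop=3):
    """Ungapped rightward extension from (q, t).

    Scores +1 per match and -1 per mismatch (illustrative values) and stops
    once the running score falls more than x_drop below the best score seen.
    Returns (best_score, length_of_best_extension).
    """
    score = best = best_len = 0
    i = 0
    while q + i < len(query) and t + i < len(target):
        score += 1 if query[q + i] == target[t + i] else -1
        if score > best:
            best, best_len = score, i + 1
        elif best - score > x_drop:
            break
        i += 1
    return best, best_len


if __name__ == "__main__":
    query, target = "ACGTAC", "ACTTAC"  # toy sequences with one mismatch
    print(find_seed_hits(query, target, "1111"))    # consecutive, weight 4
    print(find_seed_hits(query, target, "110101"))  # spaced, weight 4
    print(extend_right(query, target, 0, 0))
```

In this toy pair the single mismatch breaks every window of four consecutive letters, while the spaced mask of the same weight skips over it and still reports a hit, which is the sensitivity effect the seed model aims at.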
Similar resources
Modern Homology Search
Dynamic programming [1] has full sensitivity, but is too slow for large-scale homology search. FASTA/BLAST-type heuristics [2] trade sensitivity for speed. Can we have both sensitivity and speed? We present the mathematical theory of optimized spaced seeds which allows modern homology search to achieve high sensitivity and high speed simultaneously. The spaced seed methodology is implemented ...
PatternHunter II: Highly Sensitive and Fast Homology Search
Extending the single optimized spaced seed of PatternHunter to multiple ones, PatternHunter II simultaneously remedies the lack of sensitivity of Blastn and the lack of speed of Smith-Waterman, for homology search. At Blastn speed, PatternHunter II approaches Smith-Waterman sensitivity, bringing homology search technology back to a full circle.
Filtration of String Proximity Search via Transformation
The problem of proximity search in biological databases is addressed. We study vector transformations and conduct the application of DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques for DNA sequence proximity search to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eu...
Homology Search Methods
Homology search methods have advanced substantially in recent years. Beginning with the elegant Needleman-Wunsch and Smith-Waterman dynamic programming techniques of the 1970s, algorithms have been developed that were appropriate for the data sets and computer systems of their times. As data sets grew, faster but less sensitive heuristic algorithms, such as FASTA and BLAST, became a dominant for...
A Sensitive Sequence Comparison Method
Biologists rely heavily on good algorithms to find homologous regions in biomolecular sequences. One advanced homology search program, PatternHunter, was developed in 2002. Unlike the well-known program Blast, which uses a consecutive model, it benefits from a gapped (nonconsecutive) model to work better. By observing and analyzing some significant properties of gapped models, we propose a new ...
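The difference between a consecutive and a gapped (nonconsecutive) model can be explored with a small Monte Carlo sketch. Everything here is an assumption for illustration: the function names, the region length of 64, the 70% identity level, and the trial count. The spaced pattern used is the weight-11 pattern commonly cited for PatternHunter; it is not stated in the text above, so treat it as an assumed example rather than a confirmed parameter.

```python
import random


def seed_hits_region(region, mask):
    """True if, at some offset, every '1' position of mask lands on a match.

    region is a list of booleans: True for a matching column of an ungapped
    alignment, False for a mismatch.
    """
    ones = [i for i, m in enumerate(mask) if m == "1"]
    span = len(mask)
    return any(
        all(region[start + i] for i in ones)
        for start in range(len(region) - span + 1)
    )


def estimate_sensitivity(mask, length=64, identity=0.7, trials=2000, seed=0):
    """Estimate the fraction of random regions in which the seed hits.

    Each column matches independently with probability `identity`
    (a simplifying assumption about how homologous regions look).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    found = 0
    for _ in range(trials):
        region = [rng.random() < identity for _ in range(length)]
        found += seed_hits_region(region, mask)
    return found / trials


CONSECUTIVE = "1" * 11             # eleven consecutive required positions
SPACED = "111010010100110111"      # assumed weight-11 spaced pattern

if __name__ == "__main__":
    print("consecutive:", estimate_sensitivity(CONSECUTIVE))
    print("spaced:     ", estimate_sensitivity(SPACED))
```

Both masks require the same number of matching positions, so they produce a comparable number of random hits; under the independence assumption above, the spaced pattern should generally find a hit in a larger share of regions.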
Journal title:
- Bioinformatics
Volume 18, Issue 3
Pages: -
Publication date: 2002